XAI Questionnaire Analyser¶

This notebook analyses the results of the XAI questionnaire titled "Survey of the interpretability of decision trees", available at this GitHub repository. The goal of the questionnaire was to evaluate the interpretability of decision trees.

The notebook is divided into sections, each named after the corresponding section of the XAI questionnaire that it analyses. Each section explains what information was provided to the survey participants and highlights the results obtained.

Suggestion on How to Run the Notebook (If run in a notebook environment)¶

We suggest using the "Run All" option of the notebook interpreter instead of running one cell at a time. In a Jupyter environment, the option is available in the menu bar under Run -> Run All Cells.


Table of Contents¶

  • Dataset
  • Explanation
  • Comprehension Test Results
  • Questions Section Results
    • Question 1
    • Question 2
    • Question 3
    • Question 4
    • Question 5
  • Participants' Demographic Analysis
    • Participants Gender
    • Participants Age
    • Participants Education Level
    • Participants English Level

Dataset ¶

The data used in this questionnaire is part of the wine dataset provided by UCI. These data are the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines. Below, you can find the first ten rows of the dataset.

   Alcohol  MalicAcid   Ash  AlcalinityOfAsh  Magnesium  TotalPhenols  flavanoids  NonflavanoidsPhenols  Proanthocyanins  ColorIntensity   Hue  OD280-OD315  Proline  Class
0    14.23       1.71  2.43             15.6      127.0          2.80        3.06                  0.28             2.29            5.64  1.04         3.92   1065.0      1
1    13.20       1.78  2.14             11.2      100.0          2.65        2.76                  0.26             1.28            4.38  1.05         3.40   1050.0      1
2    13.16       2.36  2.67             18.6      101.0          2.80        3.24                  0.30             2.81            5.68  1.03         3.17   1185.0      1
3    14.37       1.95  2.50             16.8      113.0          3.85        3.49                  0.24             2.18            7.80  0.86         3.45   1480.0      1
4    13.24       2.59  2.87             21.0      118.0          2.80        2.69                  0.39             1.82            4.32  1.04         2.93    735.0      1
5    14.20       1.76  2.45             15.2      112.0          3.27        3.39                  0.34             1.97            6.75  1.05         2.85   1450.0      1
6    14.39       1.87  2.45             14.6       96.0          2.50        2.52                  0.30             1.98            5.25  1.02         3.58   1290.0      1
7    14.06       2.15  2.61             17.6      121.0          2.60        2.51                  0.31             1.25            5.05  1.06         3.58   1295.0      1
8    14.83       1.64  2.17             14.0       97.0          2.80        2.98                  0.29             1.98            5.20  1.08         2.85   1045.0      1
9    13.86       1.35  2.27             16.0       98.0          2.98        3.15                  0.22             1.85            7.22  1.01         3.55   1045.0      1
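
As a minimal sketch, a table like the one above could be read with Python's standard library along the following lines. It assumes the data is stored as comma-separated text with the column names shown above. The helper `load_wine_rows` is hypothetical and is not the notebook's own loading code; the two inline rows are copied from the table and stand in for a real file.

```python
import csv
import io


def load_wine_rows(fileobj):
    """Read wine samples from a CSV file object into a list of dicts.

    Assumes a header row with the column names shown in the table;
    every column is numeric and "Class" holds an integer label.
    """
    rows = []
    for record in csv.DictReader(fileobj):
        row = {name: float(value) for name, value in record.items()}
        row["Class"] = int(row["Class"])  # the class label is categorical
        rows.append(row)
    return rows


# Two rows copied from the table, used here in place of a real file.
sample_csv = io.StringIO(
    "Alcohol,MalicAcid,Ash,AlcalinityOfAsh,Magnesium,TotalPhenols,flavanoids,"
    "NonflavanoidsPhenols,Proanthocyanins,ColorIntensity,Hue,OD280-OD315,"
    "Proline,Class\n"
    "14.23,1.71,2.43,15.6,127.0,2.80,3.06,0.28,2.29,5.64,1.04,3.92,1065.0,1\n"
    "13.20,1.78,2.14,11.2,100.0,2.65,2.76,0.26,1.28,4.38,1.05,3.40,1050.0,1\n"
)
wine_rows = load_wine_rows(sample_csv)
print(wine_rows[0]["Alcohol"], wine_rows[0]["Class"])
```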

Explanation ¶

The XAI questionnaire presented the evaluation participants with two representations of the same domain of interest. Both representations were decision trees, and they differ in depth. The decision tree representations were:

3-Layers Decision Tree¶

The following model has a prediction accuracy of 0.927.

[Figure: diagram of the 3-layer decision tree with legend; branches are labelled ≤ and >]

5-Layers Decision Tree¶

The following model has a prediction accuracy of 0.978.

[Figure: diagram of the 5-layer decision tree with legend; branches are labelled ≤ and >]

Comprehension Test Results ¶

The comprehension test section aims to test the notions acquired by the questionnaire participants and to verify the quality of their mental model. The test is built around the sample with the following features.

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
13.2 1.78 2.14 11.2 100.0 2.65 2.76 0.26 1.28 4.38 1.05 3.4 1050.0

The questionnaire participants had to answer two questions based on the visual explanation, which represents the model as a decision tree. The visual explanation provided (3-Layers Decision Tree) is presented below.

[Figure: diagram of the 3-layer decision tree with legend; branches are labelled ≤ and >]

The two questions posed to the participants of the questionnaire are the following:

  • Q1: To which class does the wine with the following features belong? Correct Answer: Class 2
  • Q2: Which of the following features/attributes did you consider for the classification? Correct Answer: Proline, OD280/OD315, and Flavanoids
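
Both questions amount to walking the tree from the root to a leaf: the leaf gives the class (Q1) and the nodes visited give the features considered (Q2). The sketch below illustrates that procedure on a toy tree. The thresholds, the leaf labels, and the helper `trace` are all invented for illustration; they are not those of the questionnaire's tree, so the output is not expected to match the correct answers listed above. The only things taken from this notebook are the three feature values of the test sample and the convention that ≤ selects one branch and > the other.

```python
def trace(tree, sample):
    """Walk a decision tree from the root to a leaf.

    Returns the predicted class and the list of features tested on the way.
    """
    used = []
    node = tree
    while "label" not in node:  # internal nodes have no "label" key
        feature = node["feature"]
        used.append(feature)
        branch = "left" if sample[feature] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"], used


# Hypothetical tree: every threshold and leaf label here is invented.
toy_tree = {
    "feature": "Proline", "threshold": 755.0,
    "left": {
        "feature": "OD280-OD315", "threshold": 2.1,
        "left": {"label": 3},
        "right": {"label": 2},
    },
    "right": {
        "feature": "OD280-OD315", "threshold": 2.0,
        "left": {"label": 3},
        "right": {
            "feature": "flavanoids", "threshold": 2.5,
            "left": {"label": 2},
            "right": {"label": 1},
        },
    },
}

# Feature values of the comprehension-test sample that the toy tree uses.
sample = {"Proline": 1050.0, "OD280-OD315": 3.4, "flavanoids": 2.76}
predicted_class, features_used = trace(toy_tree, sample)
print(predicted_class, features_used)
```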

The Comprehension Test section results are:

  • The Q1 accuracy is 35.0
  • The Q2 accuracy is 44.0
  • Considering both Q1 and Q2, the overall accuracy is 0.312
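
One way such figures could be computed is sketched below. It assumes that each response records the class chosen for Q1 and the set of features ticked for Q2, and that a Q2 answer counts as correct only when it matches the expected set exactly. The record layout, the function name, and the four responses are invented for illustration and are not the questionnaire's data.

```python
CORRECT_Q1 = "Class 2"
CORRECT_Q2 = frozenset({"Proline", "OD280/OD315", "Flavanoids"})


def comprehension_accuracy(responses):
    """Return (Q1, Q2, both-correct) accuracy as fractions of all responses."""
    n = len(responses)
    q1_ok = [r["q1"] == CORRECT_Q1 for r in responses]
    # Assumption: Q2 is correct only on an exact match, in any order.
    q2_ok = [frozenset(r["q2"]) == CORRECT_Q2 for r in responses]
    both_ok = [a and b for a, b in zip(q1_ok, q2_ok)]
    return sum(q1_ok) / n, sum(q2_ok) / n, sum(both_ok) / n


# Invented responses, for illustration only.
responses = [
    {"q1": "Class 2", "q2": ["Proline", "OD280/OD315", "Flavanoids"]},
    {"q1": "Class 1", "q2": ["Flavanoids", "Proline", "OD280/OD315"]},
    {"q1": "Class 2", "q2": ["Proline", "Alcohol"]},
    {"q1": "Class 3", "q2": ["Hue"]},
]
q1_acc, q2_acc, both_acc = comprehension_accuracy(responses)
print(q1_acc, q2_acc, both_acc)
```

The combined accuracy can never exceed the smaller of the two individual accuracies, because a participant must answer both questions correctly to be counted.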

Questions Section Results ¶

The "Questions" section of this XAI questionnaire was designed to test the transparency of the explanation provided to the evaluation participants. To do so, it was structured as a forward simulation task: participants are given an input and an explanation and are asked to predict the system's output. The section consisted of five questions, each with three possible answers (the three classes of the dataset), of which participants had to select one. The order of the questions and the explanation shown with each one were randomized.

This section reports the participants' prediction accuracy and the time they took to answer each question, broken down by the explanation they received during the evaluation.

NB:

If the data for one of the two explanations is missing, the graph associated with it will not be displayed.
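
The per-question summaries in this section (accuracy plus mean, median, and standard deviation of the answer time for each explanation) could be produced along these lines. This is a sketch under assumptions: the record layout, the function name `summarise`, and the five answers are invented, and the sample standard deviation is used because the source does not state which variant was applied. Only explanations that actually have data appear in the result, which mirrors the note above about missing data.

```python
from statistics import mean, median, stdev


def summarise(answers, correct_class):
    """Group answers by explanation; report accuracy and timing statistics."""
    summary = {}
    for explanation in sorted({a["explanation"] for a in answers}):
        group = [a for a in answers if a["explanation"] == explanation]
        times = [a["time"] for a in group]
        n_correct = sum(a["answer"] == correct_class for a in group)
        summary[explanation] = {
            "accuracy": n_correct / len(group),
            "mean": mean(times),
            "median": median(times),
            # The sample standard deviation needs at least two values.
            "stdev": stdev(times) if len(times) > 1 else 0.0,
        }
    return summary


# Invented answers, for illustration only.
answers = [
    {"explanation": 1, "answer": 1, "time": 12.0},
    {"explanation": 1, "answer": 2, "time": 16.0},
    {"explanation": 1, "answer": 1, "time": 20.0},
    {"explanation": 2, "answer": 1, "time": 10.0},
    {"explanation": 2, "answer": 1, "time": 14.0},
]
summary = summarise(answers, correct_class=1)
for explanation, stats in summary.items():
    print(explanation, stats)
```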

Question 1 ¶

The sample used in question 1 is row 0 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
14.23 1.71 2.43 15.6 127.0 2.8 3.06 0.28 2.29 5.64 1.04 3.92 1065.0

The correct classification of the sample was class 1. The results obtained are:

  • Prediction Accuracy on Explanation 1: 0
  • Time Taken to answer the question with Explanation 1: Mean = 16.943, Median = 18.695, Standard Deviation = 5.891
  • Prediction Accuracy on Explanation 2: 1
  • Time Taken to answer the question with Explanation 2: Mean = 15.203, Median = 13.61, Standard Deviation = 6.545

Question 2 ¶

The sample used in question 2 is row 58 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
12.37 1.13 2.16 19.0 87.0 3.5 3.1 0.19 1.87 4.45 1.22 2.87 420.0

The correct classification of the sample was class 2. The results obtained are:

  • Prediction Accuracy on Explanation 1: 0
  • Time Taken to answer the question with Explanation 1: Mean = 13.308, Median = 10.758, Standard Deviation = 5.674
  • Prediction Accuracy on Explanation 2: 0
  • Time Taken to answer the question with Explanation 2: Mean = 17.244, Median = 16.908, Standard Deviation = 6.519

Question 3 ¶

The sample used in question 3 is row 156 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
13.71 5.65 2.45 20.5 95.0 1.68 0.61 0.52 1.06 7.7 0.64 1.74 740.0

The correct classification of the sample was class 3. The results obtained are:

  • Prediction Accuracy on Explanation 1: 0
  • Time Taken to answer the question with Explanation 1: Mean = 14.467, Median = 13.886, Standard Deviation = 4.998
  • Prediction Accuracy on Explanation 2: 0
  • Time Taken to answer the question with Explanation 2: Mean = 15.666, Median = 15.389, Standard Deviation = 7.113

Question 4 ¶

The sample used in question 4 is row 164 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
13.74 1.67 2.25 16.4 118.0 2.6 2.9 0.21 1.62 5.85 0.92 3.2 1060.0

The correct classification of the sample was class 1. The results obtained are:

  • Prediction Accuracy on Explanation 1: 0
  • Time Taken to answer the question with Explanation 1: Mean = 15.213, Median = 14.125, Standard Deviation = 5.99
  • Prediction Accuracy on Explanation 2: 0
  • Time Taken to answer the question with Explanation 2: Mean = 18.048, Median = 18.629, Standard Deviation = 5.29

Question 5 ¶

The sample used in question 5 is row 177 of the dataset and has the following features:

Alcohol MalicAcid Ash AlcalinityOfAsh Magnesium TotalPhenols flavanoids NonflavanoidsPhenols Proanthocyanins ColorIntensity Hue OD280-OD315 Proline
13.4 3.91 2.48 23.0 102.0 1.8 0.75 0.43 1.41 7.3 0.7 1.56 750.0

The correct classification of the sample was class 3. The results obtained are:

  • Prediction Accuracy on Explanation 1: 0
  • Time Taken to answer the question with Explanation 1: Mean = 15.527, Median = 15.699, Standard Deviation = 6.209
  • Prediction Accuracy on Explanation 2: 0
  • Time Taken to answer the question with Explanation 2: Mean = 14.618, Median = 15.225, Standard Deviation = 5.166

Participants' Demographic Analysis ¶

The questionnaire was completed by 61 participants. Below you will find some information about them.

Participants Gender ¶

Participants of the evaluation could select one of the following choices:

  • Male
  • Female
  • Other
  • Prefer not to say

Participants Age ¶

Participants of the evaluation could select one of the following choices:

  • 18-20
  • 21-29
  • 30-39
  • 40-49
  • 50-59
  • 60 or older

Participants Education Level ¶

Participants of the evaluation could select one of the following choices:

  • Less than high school degree
  • High school degree or equivalent
  • Undergraduate
  • Graduate

Participants English Level ¶

Participants of the evaluation could select one of the following choices:

  • Beginner (A1)
  • Elementary (A2)
  • Lower Intermediate (B1)
  • Upper Intermediate (B2)
  • Advanced (C1)
  • Proficient (C2)
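
Any of the four demographic questions above can be tallied in the same way. The sketch below uses the age groups as an example and keeps options that nobody selected, so that every choice appears in the result with a count of zero. The function name `tally` and the three responses are invented for illustration and are not the questionnaire's data.

```python
from collections import Counter

# Answer options of the "Participants Age" question, in questionnaire order.
AGE_GROUPS = ["18-20", "21-29", "30-39", "40-49", "50-59", "60 or older"]


def tally(responses, options):
    """Count responses per option, keeping unselected options at zero."""
    counts = Counter(responses)
    return {option: counts[option] for option in options}


# Invented responses, for illustration only.
age_counts = tally(["21-29", "21-29", "30-39"], AGE_GROUPS)
for group, count in age_counts.items():
    print(f"{group}: {count}")
```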